January 31, 2010 05:08:20pm
Reader software for IA texts: A proposal
Apple has announced its tablet computer, Amazon says that it will open up the Kindle to outside developers, and other companies are developing or have already released similar devices intended, at least in part, for reading books. With all this activity, I hope that some ambitious software developers will create applications for those devices specifically designed for organizing and reading books found at the Internet Archive and other online repositories. My dream application, for a device like the Apple iPad, would
* enable the user to search for, download, and display books from the Internet Archive, Project Gutenberg, etc.;
* be able to handle a wide range of file types, such as PDF, DjVu, HTML, and plain text;
* store the downloaded books on the device and organize them through an easy-to-understand interface;
* enable the user to rename and otherwise edit the metadata for the locally stored books while still maintaining the link to the permanent online URL (in order to allow the user to correct mistakes and to distinguish among the many books with similar or identical title fields);
* allow users to add bookmarks, notes, internal and external links, bibliographical data, and other addenda to the locally stored books;
* allow users to sync their downloaded libraries and associated data among multiple devices, including conventional desktop and laptop computers and smartphones, and to share their libraries and data with others;
* allow full-text searching of books that have associated text data;
* allow users to edit and correct the OCR and other text data for locally stored books;
* allow smart margin reduction (that is, the application would automatically detect empty space around the edges of scanned pages and trim the images to maximize the size of the text, with the amount of remaining visible margin to be set by the user);
* allow smart line leveling (that is, the application would automatically straighten scanned pages on which the lines of text are at an oblique angle);
* be able to detect and read the page numbers on scanned book pages and make pages within the books searchable by those numbers (thus making indexes and other cross-references within scanned volumes more easily usable).
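As a rough illustration of the smart margin reduction item above, here is a minimal Python sketch. It assumes a page is simply a grid of grayscale pixel values (0 for black ink, 255 for white paper); the names `trim_margins`, `keep_margin`, and `ink_threshold` are made up for this example and do not come from any existing reader application. Real scans contain specks, shadows, and dark page edges, so an actual implementation would likely need noise filtering before a crop like this is reliable.

```python
from typing import List

# Hypothetical page representation: rows of grayscale values,
# where 0 is black ink and 255 is white paper.
Page = List[List[int]]


def trim_margins(page: Page, keep_margin: int = 2, ink_threshold: int = 128) -> Page:
    """Crop a page to its inked area plus keep_margin pixels on each side."""
    # Rows that contain at least one pixel darker than the threshold.
    ink_rows = [y for y, row in enumerate(page)
                if any(p < ink_threshold for p in row)]
    if not ink_rows:
        return page  # blank page: nothing to anchor the crop on

    width = len(page[0])
    # Columns that contain at least one dark pixel.
    ink_cols = [x for x in range(width)
                if any(row[x] < ink_threshold for row in page)]

    # Grow the ink bounding box by the user-chosen margin, clamped to the page.
    top = max(ink_rows[0] - keep_margin, 0)
    bottom = min(ink_rows[-1] + keep_margin, len(page) - 1)
    left = max(ink_cols[0] - keep_margin, 0)
    right = min(ink_cols[-1] + keep_margin, width - 1)

    return [row[left:right + 1] for row in page[top:bottom + 1]]


# Usage: a 10x10 white page with a small block of "text" in the middle.
sample = [[255] * 10 for _ in range(10)]
for y in (4, 5):
    for x in range(3, 7):
        sample[y][x] = 0

trimmed = trim_margins(sample, keep_margin=1)
```

The `keep_margin` parameter stands in for the user setting mentioned in the list: a larger value leaves more white space around the text, and zero crops right to the ink.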
I'm sure there are other features that would be helpful as well, but these are the ones that come immediately to mind. Software developers, please get to work.